A Feature Selection Method based on Fuzzy Mutual Information for Fuzzy Rule-based Regression Models

Authors

  • Michela Antonelli
  • Pietro Ducange
  • Francesco Marcelloni
  • Armando Segatori
Abstract

Fuzzy rule-based models have been extensively used in regression problems. Besides high accuracy, one of the most appreciated characteristics of these models is their interpretability, which is generally measured in terms of complexity. Complexity is affected by the number of features used for generating the model: the lower the number of features, the lower the complexity. Feature selection can therefore contribute considerably not only to speeding up the learning process, but also to improving the interpretability of the final model. Nevertheless, very few methods for selecting features before rule learning have been proposed in the literature for regression problems. In this context, we propose a novel forward sequential feature selection approach based on the minimal-redundancy-maximal-relevance criterion. The relevance and the redundancy of a feature are measured in terms of, respectively, the fuzzy mutual information between the feature and the output variable, and the average fuzzy mutual information between the feature and the already selected features. The stopping criterion for the sequential selection is based on the average values of relevance and redundancy of the already selected features. We tested our feature selection method by performing two experiments on twenty regression datasets. In the first experiment, we aimed to show the effectiveness of our approach by comparing the mean square errors achieved by the fuzzy rule-based models generated using, respectively, all the features, the features selected by our approach, and the features selected by two state-of-the-art feature selection algorithms. For simplicity, we adopted the well-known Wang and Mendel algorithm for generating the fuzzy rule-based models. We show that the mean square errors obtained by models generated using the features selected by our approach are on average similar to the values achieved by using all the features, and lower than those obtained by employing the subsets of features selected by the two state-of-the-art feature selection algorithms. In the second experiment, we intended to evaluate how our feature selection algorithm can reduce the convergence time of evolutionary fuzzy systems, which are probably the most effective fuzzy techniques for tackling regression problems. By using a state-of-the-art multi-objective evolutionary fuzzy system based on rule learning and membership function tuning, we show that the number of evaluations can be reduced by more than 40% when the dataset is pre-processed by our feature selection algorithm.

*Corresponding author, Tel: +39 0502217678, Fax: +39 0502217600. Preprint submitted to Information Science, December 23, 2014.
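
The forward selection procedure described in the abstract can be illustrated with a minimal Python sketch. Everything below is an assumption made for illustration and is not taken from the paper: the uniform triangular partitions with five fuzzy sets, the estimate of fuzzy probabilities as averaged products of membership degrees, the function names `memberships`, `fuzzy_mi` and `select_features`, and in particular the stopping rule, which here simply halts when the best candidate's relevance no longer exceeds its average redundancy. The abstract only states that the actual stopping criterion relies on the average relevance and redundancy of the already selected features.

```python
import numpy as np

def memberships(x, n_sets=5):
    """Membership degrees of each sample to n_sets uniform triangular fuzzy sets.

    Assumption: a strong uniform partition over [min(x), max(x)], n_sets >= 2.
    """
    x = np.asarray(x, dtype=float)
    lo, hi = x.min(), x.max()
    if hi == lo:
        # Constant variable: assign every sample fully to the first fuzzy set.
        m = np.zeros((x.size, n_sets))
        m[:, 0] = 1.0
        return m
    centers = np.linspace(lo, hi, n_sets)
    width = (hi - lo) / (n_sets - 1)
    return np.clip(1.0 - np.abs(x[:, None] - centers[None, :]) / width, 0.0, 1.0)

def fuzzy_mi(mx, my):
    """Mutual information between two fuzzy-partitioned variables.

    Assumption: the joint fuzzy probability of a pair of fuzzy sets is the
    mean product of the two membership degrees over the samples.
    """
    joint = mx.T @ my / mx.shape[0]
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    mask = joint > 0
    return float(np.sum(joint[mask] * np.log(joint[mask] / (px @ py)[mask])))

def select_features(X, y, n_sets=5):
    """Forward sequential selection with a relevance-minus-redundancy score."""
    X = np.asarray(X, dtype=float)
    n_features = X.shape[1]
    my = memberships(y, n_sets)
    mf = [memberships(X[:, j], n_sets) for j in range(n_features)]
    # Relevance: fuzzy MI between each feature and the output variable.
    relevance = np.array([fuzzy_mi(m, my) for m in mf])
    selected = [int(np.argmax(relevance))]
    remaining = set(range(n_features)) - set(selected)
    redundancy_sum = np.zeros(n_features)
    while remaining:
        last = selected[-1]
        for j in remaining:
            redundancy_sum[j] += fuzzy_mi(mf[j], mf[last])
        candidates = sorted(remaining)
        # Redundancy: average fuzzy MI with the already selected features.
        scores = np.array([relevance[j] - redundancy_sum[j] / len(selected)
                           for j in candidates])
        best = int(np.argmax(scores))
        if scores[best] <= 0:
            break  # placeholder stopping rule (assumption, not the paper's)
        selected.append(candidates[best])
        remaining.remove(candidates[best])
    return selected
```

Calling `select_features(X, y)` on a samples-by-features array returns the indices of the chosen features in selection order. The first index is always the feature with the highest relevance, and each later one maximizes relevance minus average redundancy with respect to the features chosen so far.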

Similar resources

Mutual information-based feature selection and partition design in fuzzy rule-based classifiers from vague data

Algorithms for preprocessing databases with incomplete and imprecise data are seldom studied. For the most part, we lack numerical tools to quantify the mutual information between fuzzy random variables. Therefore, these algorithms (discretization, instance selection, feature selection, etc.) have to use crisp estimations of the interdependency between continuous variables, whose application to...

A hybrid filter-based feature selection method via hesitant fuzzy and rough sets concepts

High-dimensional microarray datasets are difficult to classify since they have many features with a small number of instances and an imbalanced distribution of classes. This paper proposes a filter-based feature selection method to improve the classification performance of microarray datasets by selecting the significant features. Combining the concepts of rough sets, weighted rough set, fuzzy rough se...

Bi-criteria Genetic Selection of Bagging Fuzzy Rule-based Multiclassification Systems

Previously we proposed a scheme to generate fuzzy rule-based multiclassification systems by means of bagging, mutual information-based feature selection, and a multicriteria genetic algorithm (GA) for static component classifier selection guided by the ensemble training error. In the current contribution we extend the latter component by the use of two bi-criteria fitness functions, combining t...

NEW CRITERIA FOR RULE SELECTION IN FUZZY LEARNING CLASSIFIER SYSTEMS

Designing an effective criterion for selecting the best rule is a major problem in the process of implementing Fuzzy Learning Classifier (FLC) systems. Conventionally, confidence and support, or combined measures of these, are used as criteria for fuzzy rule evaluation. In this paper, new entities, namely precision and recall from the field of Information Retrieval (IR) systems, are adapted as alternative...

Fuzzy-rough Information Gain Ratio Approach to Filter-wrapper Feature Selection

Feature selection for various applications has been carried out for many years in many different research areas. However, there is a trade-off between finding feature subsets with minimum length and increasing the classification accuracy. In this paper, a filter-wrapper feature selection approach based on fuzzy-rough gain ratio is proposed to tackle this problem. As a search strategy, a modifie...

Publication date: 2014